# Long Audio Processing
Whisper Large V3 Turbo
MIT
Whisper is OpenAI's state-of-the-art automatic speech recognition (ASR) and speech translation model, trained on over 5 million hours of labeled data, with strong zero-shot generalization. The Turbo version is a pruned and fine-tuned variant of Whisper large-v3 that cuts the decoder from 32 layers to 4, making it significantly faster at a slight cost in quality.
Speech Recognition
Transformers Supports Multiple Languages

unsloth
94
1
Whisper Large V3
Apache-2.0
Whisper is OpenAI's state-of-the-art automatic speech recognition (ASR) and speech translation model, supporting multiple languages.
Speech Recognition
Safetensors Supports Multiple Languages
unsloth
4,002
1
Quantum STT
Apache-2.0
Quantum_STT is an advanced automatic speech recognition (ASR) and speech translation model, trained with large-scale weak supervision, supporting multiple languages and tasks.
Speech Recognition
Transformers Supports Multiple Languages

sbapan41
100
1
Whisper Large V3 Turbo Gguf
MIT
Whisper large-v3-turbo is a pruned and fine-tuned version of Whisper large-v3, with the decoder reduced from 32 layers to 4, making it significantly faster at a slight cost in quality.
Speech Recognition Supports Multiple Languages
xkeyC
546
1
Whisper Small Tel
Apache-2.0
A Telugu speech recognition model, fine-tuned from OpenAI Whisper-large-v2 on Telugu audio datasets.
Speech Recognition
Transformers Other

sagarchapara
17
1
Distil Large V3.5
MIT
Distil-Whisper is a knowledge-distilled version of OpenAI Whisper-Large-v3, achieving efficient speech recognition through large-scale pseudo-label training.
Speech Recognition
Transformers English

distil-whisper
4,804
25
Whisper Large V3 Turbo Common Voice 19 0 Zh TW
MIT
A Traditional Chinese (Taiwan) automatic speech recognition model, fine-tuned from OpenAI Whisper-large-v3-turbo.
Speech Recognition
Transformers Chinese

JacobLinCool
220
4
Whisper Large V3 Turbo
MIT
Whisper is a state-of-the-art automatic speech recognition (ASR) and speech translation model developed by OpenAI, trained on over 5 million hours of labeled data, demonstrating strong generalization capabilities in zero-shot settings.
Speech Recognition
Transformers Supports Multiple Languages

openai
4.0M
2,317
Kotoba Whisper V2.0 Faster
MIT
A Whisper speech recognition model optimized for CTranslate2, specifically tailored for Japanese, providing efficient speech-to-text functionality.
Speech Recognition Japanese
kotoba-tech
202
14
Audio Transcribe
This is a Transformer-based Automatic Speech Recognition (ASR) model for transcribing audio files into text.
Speech Recognition
washeed
257
4
Distil Small.en
MIT
Distil-Whisper is a distilled version of the Whisper model that is 6x faster and 49% smaller, while performing within 1% WER of the original on out-of-distribution evaluation sets.
Speech Recognition
Transformers English

distil-whisper
33.51k
97
Whisper Large V3 German
Apache-2.0
A German speech recognition model fine-tuned from Whisper Large v3 and optimized for German speech.
Speech Recognition
Transformers German

primeline
8,745
70
Distil Medium.en
MIT
Distil-Whisper is a distilled version of the Whisper model, 6 times faster than the original, with a 49% reduction in size, while maintaining performance close to the original in English speech recognition tasks.
Speech Recognition English
distil-whisper
186.85k
120
Distil Large V2
MIT
Distil-Whisper is a distilled version of the Whisper model that is 6x faster and 49% smaller, while staying within 1% WER of the original on out-of-distribution evaluation sets.
Speech Recognition English
distil-whisper
42.65k
508